Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public. As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access language model designed and built thanks to a collaboration of hundreds of researchers. BLOOM is a decoder-only Transformer language model that was trained on the ROOTS corpus, a dataset comprising hundreds of sources in 46 natural and 13 programming languages (59 in total). We find that BLOOM achieves competitive performance on a wide variety of benchmarks, with stronger results after undergoing multitask prompted finetuning. To facilitate future research and applications using LLMs, we publicly release our models and code under the Responsible AI License.
Translated by Google Translate
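The next generation of physical science involves robot scientists: autonomous physical science systems capable of experimental design, execution, and analysis in a closed loop. Such systems have shown real-world success in scientific exploration and discovery, including the first discovery of a best-in-class material. To build and use these systems, the next-generation workforce requires expertise in diverse areas, including ML, control systems, measurement science, materials synthesis, and decision theory, among others. However, education is lagging. Educators need a low-cost, easy-to-use platform to teach the required skills. Industry can also use such a platform to develop and evaluate autonomous physical science methodologies. We present the next generation in science education, a kit for building a low-cost autonomous scientist. The kit was used in two courses at the University of Maryland to teach autonomous physical science to undergraduate and graduate students. We discuss its use in the courses and its broader capability to teach the dual tasks of autonomous model exploration, optimization, and determination, with an example of autonomous experimental "discovery".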
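Rare event searches allow us to search for new physics that is inaccessible by other means by leveraging specialized large detectors. Machine learning provides a new tool to maximize the information provided by these detectors. The information is sparse, which forces these algorithms to start from the lowest-level data and exploit all symmetries in the detector to produce results. In this work, we present KamNet, which harnesses breakthroughs in geometric deep learning and spatiotemporal data analysis to maximize the physics reach of KamLAND-Zen, a kiloton-scale spherical liquid scintillator detector searching for neutrinoless double beta decay ($0\nu\beta\beta$). Using a simplified background model for KamLAND, we show that KamNet outperforms a conventional CNN on benchmark MC simulations with a higher level of robustness. Using simulated data, we then demonstrate KamNet's ability to increase KamLAND-Zen's sensitivity to $0\nu\beta\beta$. A key component of this work is the addition of an attention mechanism to elucidate the underlying physics that KamNet uses for background rejection.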
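The application of artificial intelligence (AI), and more specifically machine learning, to the physical sciences has expanded significantly over the past decades. In particular, science-informed AI, or scientific AI, has grown from a focus on data analysis to now controlling experiment design, simulation, execution, and analysis in closed-loop autonomous systems. The CAMEO (closed-loop autonomous materials exploration and optimization) algorithm employs scientific AI to address two tasks: learning a material system's composition-structure relationship and identifying material compositions with optimal functional properties. By integrating these tasks, accelerated materials screening across compositional phase diagrams was demonstrated, resulting in the discovery of a best-in-class phase-change memory material. Key to this success is the ability to guide subsequent measurements to maximize knowledge of the composition-structure relationship, or phase map. In this work, we investigate the benefits of incorporating varying levels of prior physical knowledge into CAMEO's autonomous phase mapping. This includes the use of ab initio phase boundary data from the AFLOW repositories, which has been shown to optimize CAMEO's search when used as a prior.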
Variational inference uses optimization, rather than integration, to approximate the marginal likelihood, and thereby the posterior, in a Bayesian model. Thanks to advances in computational scalability made in the last decade, variational inference is now the preferred choice for many high-dimensional models and large datasets. This tutorial introduces variational inference from the parametric perspective that dominates these recent developments, in contrast to the mean-field perspective commonly found in other introductory texts.
Knowledge graphs (KGs) have served as a key component of various natural language processing applications. Commonsense knowledge graphs (CKGs) are a special type of KG in which entities and relations are composed of free-form text. However, previous work on KG completion and CKG completion suffers from long-tail relations and newly added relations that do not have many known triples for training. In light of this, few-shot KG completion (FKGC), which requires the strengths of graph representation learning and few-shot learning, has been proposed to address the problem of limited annotated data. In this paper, we comprehensively survey previous attempts at such tasks in the form of a series of methods and applications. Specifically, we first introduce FKGC challenges, commonly used KGs, and CKGs. Then we systematically categorize and summarize existing works in terms of the type of KGs and the methods. Finally, we present applications of FKGC models on prediction tasks in different areas and share our thoughts on future research directions of FKGC.
Few-Shot Instance Segmentation (FSIS) requires models to detect and segment novel classes given only a few support examples. In this work, we explore a simple yet unified solution for FSIS as well as its incremental variants, and introduce a new framework named Reference Twice (RefT) to fully explore the relationship between support and query features based on a Transformer-like framework. Our key insights are twofold. First, with the aid of support masks, we can generate dynamic class centers more appropriately to re-weight query features. Second, we find that support object queries have already encoded key factors after base training. In this way, the query features can be enhanced twice, at the feature level and at the instance level. In particular, we first design a mask-based dynamic weighting module to enhance support features and then propose to link object queries for better calibration via cross-attention. After the above steps, performance on the novel classes improves significantly over our strong baseline. Additionally, our new framework can be easily extended to incremental FSIS with minor modification. When benchmarking on the COCO dataset in the FSIS, gFSIS, and iFSIS settings, our method achieves competitive performance compared to existing approaches across different shots, e.g., we boost nAP by a noticeable +8.2/+9.4 over the current state-of-the-art FSIS method for 10/30-shot. We further demonstrate the superiority of our approach on Few-Shot Object Detection. Code and model will be available.
Unsupervised domain adaptation (UDA) for semantic segmentation is a promising task freeing people from heavy annotation work. However, domain discrepancies in low-level image statistics and high-level contexts compromise the segmentation performance over the target domain. A key idea to tackle this problem is to perform both image-level and feature-level adaptation jointly. Unfortunately, there is a lack of such unified approaches for UDA tasks in the existing literature. This paper proposes a novel UDA pipeline for semantic segmentation that unifies image-level and feature-level adaptation. Concretely, for image-level domain shifts, we propose a global photometric alignment module and a global texture alignment module that align images in the source and target domains in terms of image-level properties. For feature-level domain shifts, we perform global manifold alignment by projecting pixel features from both domains onto the feature manifold of the source domain; and we further regularize category centers in the source domain through a category-oriented triplet loss and perform target domain consistency regularization over augmented target domain images. Experimental results demonstrate that our pipeline significantly outperforms previous methods. In the commonly tested GTA5$\rightarrow$Cityscapes task, our proposed method using Deeplab V3+ as the backbone surpasses previous SOTA by 8%, achieving 58.2% in mIoU.
The development of social media user stance detection and bot detection methods relies heavily on large-scale, high-quality benchmarks. However, in addition to low annotation quality, existing benchmarks generally have incomplete user relationships, which hinders graph-based account detection research. To address these issues, we propose a Multi-Relational Graph-Based Twitter Account Detection Benchmark (MGTAB), the first standardized graph-based benchmark for account detection. To our knowledge, MGTAB was built based on the largest original data in the field, with over 1.55 million users and 130 million tweets. MGTAB contains 10,199 expert-annotated users and 7 types of relationships, ensuring high-quality annotation and diversified relations. In MGTAB, we extracted the 20 user property features with the greatest information gain, together with user tweet features, as the user features. In addition, we performed a thorough evaluation of MGTAB and other public datasets. Our experiments found that graph-based approaches are generally more effective than feature-based approaches and perform better when multiple relations are introduced. By analyzing the experimental results, we identify effective approaches for account detection and provide potential future research directions in this field. Our benchmark and standardized evaluation procedures are freely available at: https://github.com/GraphDetec/MGTAB.
The performance of inertial navigation systems is largely dependent on a stable flow of external measurements and information to guarantee continuous filter updates and bound the inertial solution drift. Platforms in different operational environments may be prevented at some point from receiving external measurements, thus exposing their navigation solution to drift. Over the years, a wide variety of works have been proposed to overcome this shortcoming by exploiting knowledge of the system's current conditions and turning it into an applicable source of information to update the navigation filter. This paper aims to provide an extensive survey of information-aided navigation, broadly classified into direct, indirect, and model aiding. Each approach is described by the notable works that implemented its concept, use cases, relevant state updates, and their corresponding measurement models. By matching the appropriate constraint to a given scenario, one will be able to improve the navigation solution accuracy, compensate for the lost information, and uncover certain internal states that would otherwise remain unobservable.